311 research outputs found

    Spatial Dynamics of Human-Origin H1 Influenza A Virus in North American Swine

    The emergence and rapid global spread of the swine-origin H1N1/09 pandemic influenza A virus in humans underscores the importance of swine populations as reservoirs for genetically diverse influenza viruses with the potential to infect humans. However, despite their significance for animal and human health, relatively little is known about the phylogeography of swine influenza viruses in the United States. This study utilizes an expansive data set of hemagglutinin (HA1) sequences (n = 1516) from swine influenza viruses collected in North America during the period 2003–2010. With these data we investigate the spatial dissemination of a novel influenza virus of the H1 subtype that was introduced into the North American swine population via two separate human-to-swine transmission events around 2003. Bayesian phylogeographic analysis reveals that the spatial dissemination of this influenza virus in the US swine population follows long-distance swine movements from the Southern US to the Midwest, a corn-rich commercial center that imports millions of swine annually. Hence, multiple genetically diverse influenza viruses are introduced and co-circulate in the Midwest, providing the opportunity for genomic reassortment. Overall, the Midwest serves primarily as an ecological sink for swine influenza in the US, with sources of virus genetic diversity instead located in the Southeast (mainly North Carolina) and South-central (mainly Oklahoma) regions. Understanding the importance of long-distance pig transportation in the evolution and spatial dissemination of the influenza virus in swine may inform future strategies for the surveillance and control of influenza, and perhaps other swine pathogens.

    Evolutionary distances in the twilight zone -- a rational kernel approach

    Phylogenetic tree reconstruction is traditionally based on multiple sequence alignments (MSAs) and heavily depends on the validity of this information bottleneck. With increasing sequence divergence, the quality of MSAs decays quickly. Alignment-free methods, on the other hand, are based on abstract string comparisons and avoid potential alignment problems. However, in general they are not biologically motivated and ignore our knowledge about the evolution of sequences. Thus, it is still a major open question how to define an evolutionary distance metric between divergent sequences that makes use of indel information and known substitution models without the need for a multiple alignment. Here we propose a new evolutionary distance metric to close this gap. It uses finite-state transducers to create a biologically motivated similarity score which models substitutions and indels, and does not depend on a multiple sequence alignment. The sequence similarity score is defined in analogy to pairwise alignments and additionally has the positive semi-definite property. We describe its derivation and show in simulation studies and real-world examples that it is more accurate in reconstructing phylogenies than competing methods. The result is a new and accurate way of determining evolutionary distances in and beyond the twilight zone of sequence alignments that is suitable for large datasets. Comment: to appear in PLoS ON
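    The abstract notes that the similarity score is positive semi-definite. One standard way to turn such a score into a distance, which may or may not be the construction the authors use, is the kernel-induced distance d(x, y) = sqrt(k(x, x) + k(y, y) − 2 k(x, y)). The sketch below is only an illustration under that assumption: the function name `kernel_distance_matrix` and the similarity values are hypothetical and are not taken from the paper.

```python
import math


def kernel_distance_matrix(K):
    """Convert a symmetric positive semi-definite similarity matrix K
    into a matrix of pairwise kernel-induced distances."""
    n = len(K)
    D = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            # Squared distance in the feature space implied by the kernel.
            sq = K[i][i] + K[j][j] - 2.0 * K[i][j]
            # Clamp tiny negative values that can arise from rounding.
            D[i][j] = math.sqrt(max(sq, 0.0))
    return D


# Hypothetical similarity scores for three sequences (made-up numbers,
# chosen to be symmetric and diagonally dominant, hence positive definite).
K = [
    [4.0, 2.0, 0.0],
    [2.0, 4.0, 1.0],
    [0.0, 1.0, 4.0],
]
D = kernel_distance_matrix(K)
```

    A distance matrix of this kind could then be passed to any distance-based tree-building method; which method the paper pairs with its score is not stated in the abstract.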

    Modeling of the Temporal Patterns of Fluoxetine Prescriptions and Suicide Rates in the United States

    BACKGROUND: To study the potential association of antidepressant use and suicide at a population level, we analyzed the associations between suicide rates and dispensing of the prototypic SSRI antidepressant fluoxetine in the United States during the period 1960–2002. METHODS AND FINDINGS: Sources of data included Centers for Disease Control and US Census Bureau age-adjusted suicide rates since 1960 and numbers of fluoxetine sales in the US since its introduction in 1988. We conducted statistical analysis of age-adjusted population data and prescription numbers. Suicide rates fluctuated between 12.2 and 13.7 per 100,000 for the entire population from the early 1960s until 1988. Since then, suicide rates have gradually declined, with the lowest value of 10.4 per 100,000 in 2000. This steady decline is significantly associated with increased numbers of fluoxetine prescriptions dispensed from 2,469,000 in 1988 to 33,320,000 in 2002 (r(s) = −0.92; p < 0.001). Mathematical modeling of what suicide rates would have been during the 1988–2002 period based on pre-1988 data indicates that since the introduction of fluoxetine in 1988 through 2002 there has been a cumulative decrease in expected suicide mortality of 33,600 individuals (posterior median, 95% Bayesian credible interval 22,400–45,000). CONCLUSIONS: The introduction of SSRIs in 1988 has been temporally associated with a substantial reduction in the number of suicides. This effect may have been more apparent in the female population, whom we postulate might have particularly benefited from SSRI treatment. While these types of data cannot lead to conclusions on causality, we suggest here that in the context of untreated depression being the major cause of suicide, antidepressant treatment could have had a contributory role in the reduction of suicide rates in the period 1988–2002.

    Bayesian modeling of recombination events in bacterial populations

    Background: We consider the discovery of recombinant segments jointly with their origins within multilocus DNA sequences from bacteria representing heterogeneous populations of fairly closely related species. The currently available methods for recombination detection capable of probabilistic characterization of uncertainty have limited applicability in practice as the number of strains in a data set increases. Results: We introduce a Bayesian spatial structural model representing the continuum of origins over sites within the observed sequences, including a probabilistic characterization of uncertainty related to the origin of any particular site. To enable a statistically accurate and practically feasible approach to the analysis of large-scale data sets representing a single genus, we have developed a novel software tool (BRAT, Bayesian Recombination Tracker) implementing the model and the corresponding learning algorithm, which is capable of identifying the posterior optimal structure and estimating the marginal posterior probabilities of putative origins over the sites. Conclusion: A multitude of challenging simulation scenarios and an analysis of real data from seven housekeeping genes of 120 strains of genus Burkholderia are used to illustrate the possibilities offered by our approach. The software is freely available for download at URL http://web.abo.fi/fak/mnf//mate/jc/software/brat.html

    The Dawn of Open Access to Phylogenetic Data

    The scientific enterprise depends critically on the preservation of and open access to published data. This basic tenet applies acutely to phylogenies (estimates of evolutionary relationships among species). Increasingly, phylogenies are estimated from increasingly large, genome-scale datasets using increasingly complex statistical methods that require increasing levels of expertise and computational investment. Moreover, the resulting phylogenetic data provide an explicit historical perspective that critically informs research in a vast and growing number of scientific disciplines. One such use is the study of changes in rates of lineage diversification (speciation − extinction) through time. As part of a meta-analysis in this area, we sought to collect phylogenetic data (comprising nucleotide sequence alignment and tree files) from 217 studies published in 46 journals over a 13-year period. We document our attempts to procure those data (from online archives and by direct request to corresponding authors), and report results of analyses (using Bayesian logistic regression) to assess the impact of various factors on the success of our efforts. Overall, complete phylogenetic data for ~60% of these studies are effectively lost to science. Our study indicates that phylogenetic data are more likely to be deposited in online archives and/or shared upon request when: (1) the publishing journal has a strong data-sharing policy; (2) the publishing journal has a higher impact factor; and (3) the data are requested from faculty rather than students. Although the situation appears dire, our analyses suggest that it is far from hopeless: recent initiatives by the scientific community -- including policy changes by journals and funding agencies -- are improving the state of affairs.

    Phylogeography of Japanese encephalitis virus: genotype is associated with climate

    The circulation of vector-borne zoonotic viruses is largely determined by the overlap in the geographical distributions of virus-competent vectors and reservoir hosts. What is less clear are the factors influencing the distribution of virus-specific lineages. Japanese encephalitis virus (JEV) is the most important etiologic agent of epidemic encephalitis worldwide, and is primarily maintained between vertebrate reservoir hosts (avian and swine) and culicine mosquitoes. There are five genotypes of JEV: GI-V. In recent years, GI has displaced GIII as the dominant JEV genotype and GV has re-emerged after almost 60 years of undetected virus circulation. JEV is found throughout most of Asia, extending from maritime Siberia in the north to Australia in the south, and as far as Pakistan to the west and Saipan to the east. Transmission of JEV in temperate zones is epidemic with the majority of cases occurring in summer months, while transmission in tropical zones is endemic and occurs year-round at lower rates. To test the hypothesis that viruses circulating in these two geographical zones are genetically distinct, we applied Bayesian phylogeographic, categorical data analysis and phylogeny-trait association test techniques to the largest JEV dataset compiled to date, representing the envelope (E) gene of 487 isolates collected from 12 countries over 75 years. We demonstrated that GIII and the recently emerged GI-b are temperate genotypes likely maintained year-round in northern latitudes, while GI-a and GII are tropical genotypes likely maintained primarily through mosquito-avian and mosquito-swine transmission cycles. This study represents a new paradigm directly linking viral molecular evolution and climate.

    Accurate reconstruction of insertion-deletion histories by statistical phylogenetics

    The Multiple Sequence Alignment (MSA) is a computational abstraction that represents a partial summary either of indel history, or of structural similarity. Taking the former view (indel history), it is possible to use formal automata theory to generalize the phylogenetic likelihood framework for finite substitution models (Dayhoff's probability matrices and Felsenstein's pruning algorithm) to arbitrary-length sequences. In this paper, we report results of a simulation-based benchmark of several methods for reconstruction of indel history. The methods tested include a relatively new algorithm for statistical marginalization of MSAs that sums over a stochastically-sampled ensemble of the most probable evolutionary histories. For mammalian evolutionary parameters on several different trees, the single most likely history sampled by our algorithm appears less biased than histories reconstructed by other MSA methods. The algorithm can also be used for alignment-free inference, where the MSA is explicitly summed out of the analysis. As an illustration of our method, we discuss reconstruction of the evolutionary histories of human protein-coding genes. Comment: 28 pages, 15 figures. arXiv admin note: text overlap with arXiv:1103.434

    Evolutionary History and Population Dynamics of Hepatitis E Virus

    BACKGROUND: Hepatitis E virus (HEV) is an enterically transmitted hepatotropic virus. It segregates as four genotypes. All genotypes infect humans while only genotypes 3 and 4 also infect several animal species. It has been suggested that hepatitis E is zoonotic, but no study has analyzed the evolutionary history of HEV. We present here an analysis of the evolutionary history of HEV. METHODS AND FINDINGS: The times to the most recent common ancestors for all four genotypes of HEV were calculated using BEAST to conduct a Bayesian analysis of HEV. The population dynamics for genotypes 1, 3 and 4 were analyzed using skyline plots. Bayesian analysis showed that the most recent common ancestor for modern HEV existed between 536 and 1344 years ago. The progenitor of HEV appears to have given rise to anthropotropic and enzootic forms of HEV, which evolved into genotypes 1 and 2 and genotypes 3 and 4, respectively. Population dynamics suggest that genotypes 1, 3 and 4 experienced a population expansion during the 20th century. Genotype 1 increased in infected population size ~30-35 years ago. Genotypes 3 and 4 experienced an increase in population size starting late in the 19th century until ca. 1940-45, with genotype 3 having undergone additional rapid expansion until ca. 1960. The effective population size for both genotypes 3 and 4 rapidly declined to pre-expansion levels starting in ca. 1990. Genotype 4 was further examined as Chinese and Japanese sequences, which exhibited different population dynamics, suggesting that this genotype experienced different evolutionary history in these two countries. CONCLUSIONS: HEV appears to have evolved through a series of steps, in which the ancestors of HEV may have adapted to a succession of animal hosts leading to humans. Analysis of the population dynamics of HEV suggests a substantial temporal variation in the rate of transmission among HEV genotypes in different geographic regions late in the 20th century.

    The HIV-1 Subtype C Epidemic in South America Is Linked to the United Kingdom

    Background: The global spread of HIV-1 has been accompanied by the emergence of genetically distinct viral strains. Over the past two decades subtype C viruses, which predominate in Southern and Eastern Africa, have spread rapidly throughout parts of South America. Phylogenetic studies indicate that subtype C viruses were introduced to South America through a single founder event that occurred in Southern Brazil. However, the external route via which subtype C viruses spread to the South American continent has remained unclear. Methodology/Principal Findings: We used automated genotyping to screen 8,309 HIV-1 subtype C pol gene sequences sampled within the UK for isolates genetically linked to the subtype C epidemic in South America. Maximum likelihood and Bayesian approaches were used to explore the phylogenetic relationships between 54 sequences identified in this screen, and a set of globally sampled subtype C reference sequences. Phylogenetic trees disclosed a robustly supported relationship between sequences from Brazil, the UK and East Africa. A monophyletic cluster comprised exclusively of sequences from the UK and Brazil was identified and dated to approximately the early 1980s using a Bayesian coalescent-based method. A sub-cluster of 27 sequences isolated from homosexual men of UK origin was also identified and dated to the early 1990s. Conclusions: Phylogenetic, demographic and temporal data support the conclusion that the UK was a crucial staging post in the spread of subtype C from East Africa to South America. This unexpected finding demonstrates the role of diffuse international networks in the global spread of HIV-1 infection, and the utility of globally sampled viral sequence data in revealing these networks. Additionally, we show that subtype C viruses are spreading within the UK amongst men who have sex with men.